Experiments in Parallel Clustering with DBSCAN

Authors

  • Domenica Arlia
  • Massimo Coppola
Abstract

We present a new result concerning the parallelisation of DBSCAN, a data mining algorithm for density-based spatial clustering. The overall structure of DBSCAN has been mapped to a skeleton-structured program that performs parallel exploration of each cluster. The approach is useful for improving performance on high-dimensional data, and is general with respect to the spatial index structure used. We report preliminary results of the application running on a Beowulf cluster with good efficiency.
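To make the idea of parallel cluster exploration concrete, here is a minimal sketch in Python. It is not the authors' skeleton-based implementation: the names `dbscan`, `brute_force_query`, and the `workers` parameter are hypothetical, the brute-force neighbourhood scan merely stands in for a real spatial index, and the thread pool only illustrates the structure (a batch of independent region queries per expansion step), not actual speedup.

```python
from concurrent.futures import ThreadPoolExecutor
from math import dist

NOISE = -1        # label for points that belong to no cluster
UNVISITED = None  # label for points not yet examined


def brute_force_query(points, i, eps):
    """Hypothetical stand-in for a spatial index: indices within eps of point i."""
    return [j for j, q in enumerate(points) if dist(points[i], q) <= eps]


def dbscan(points, eps, min_pts, region_query=brute_force_query, workers=4):
    """Label each point with a cluster id (0, 1, ...) or NOISE.

    The region queries of each expansion step are independent of one
    another, so they are submitted to a worker pool as one batch.
    """
    labels = [UNVISITED] * len(points)
    cluster = 0

    def claim(candidates):
        """Assign candidates to the current cluster; return those still to expand."""
        newly_claimed = []
        for j in candidates:
            if labels[j] == NOISE:
                labels[j] = cluster  # border point: already known not to be core
            elif labels[j] is UNVISITED:
                labels[j] = cluster
                newly_claimed.append(j)
        return newly_claimed

    with ThreadPoolExecutor(max_workers=workers) as pool:
        for seed in range(len(points)):
            if labels[seed] is not UNVISITED:
                continue
            neighbours = region_query(points, seed, eps)
            if len(neighbours) < min_pts:
                labels[seed] = NOISE
                continue
            labels[seed] = cluster
            frontier = claim(neighbours)
            while frontier:
                # One batch of independent region queries per expansion step.
                results = pool.map(lambda j: region_query(points, j, eps), frontier)
                next_frontier = []
                for found in results:
                    if len(found) >= min_pts:  # only core points extend the cluster
                        next_frontier.extend(claim(found))
                frontier = next_frontier
            cluster += 1
    return labels


if __name__ == "__main__":
    sample = [(0, 0), (0, 1), (1, 0), (10, 10), (10, 11), (11, 10), (50, 50)]
    print(dbscan(sample, eps=1.5, min_pts=3))
```

Passing a different `region_query` callable is how this sketch keeps the clustering logic independent of the index structure; labels are only written by the coordinating thread, so the workers need no locking.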


Similar resources

An Examination of the Problems of the DBSCAN Clustering Algorithm and a Review of the Improvements Proposed for It

Clustering is an important knowledge discovery technique in databases. Density-based clustering algorithms are one of the main methods for clustering in data mining. These algorithms have some special features, including being independent of the shape of the clusters, being highly understandable, and being easy to use. DBSCAN is a base algorithm for density-based clustering algorithms. DBSCAN is able to...


Clustering Research across Tibetan and Chinese Texts

Tibetan text clustering has potential in the Tibetan information processing domain. In this paper, clustering research across Chinese and Tibetan texts is proposed to benefit Chinese and Tibetan machine translation and sentence alignment. A Tibetan and Chinese keyword table is the main way to implement the text clustering across these two languages. Improved K-means and improved density-based spatia...


A Robust Density-Based Clustering Approach Using DBCURE-MapReduce Techniques

Clustering is the process of grouping similar data into the same cluster and dissimilar data into different clusters. Density-based clustering, as in DBSCAN and OPTICS, is a useful clustering approach. The increasing volume of data and the varying size of data sets make the clustering process challenging. We therefore propose a parallel clustering framework using an advanced approach called MapReduce. We deve...


The DBSCAN Clustering Algorithm by a P System with Active Membranes

The distinguishing characteristic of the P system with active membranes is that not only the objects evolve but also the membrane structure. Because the membrane structure can change, such a system can be used in parallel computation for solving clustering problems. In this paper a P system with active membranes for solving DBSCAN clustering problems is proposed. This new model of P system can reduce th...


Survey and Performance Evaluation of DBSCAN Spatial Clustering Implementations for Big Data and High-Performance Computing Paradigms

Big data is often mined using clustering algorithms. Density-Based Spatial Clustering of Applications with Noise (DBSCAN) is a popular spatial clustering algorithm. However, it is computationally expensive and thus for clustering big data, parallel processing is required. The two prevalent paradigms for parallel processing are High-Performance Computing (HPC) based on Message Passing Interface ...



Publication date: 2001